GradQuant: Introduction to Network Analysis in igraph

Network Analysis Introduction

What is a network?

A network graph depicts interconnections between nodes.

The presence or absence of each interconnection indicates whether there exists some form of connection between each pair of nodes.

Examples

  • Google Maps
  • Friendships between individuals
  • Flight routes between cities
  • Connections between brain regions.
  • Communications networks, telephone, satellite, cable and internet
  • Discrete words/phrases and their linguistic relations
  • Atomic or molecular structures

Terms

Nodes:

Install and Load Relevant Packages

# install.packages("igraph") # For network analysis
# install.packages("igraphdata") # For sample datasets
# install.packages("RColorBrewer") # For colors
library(igraph)
library(igraphdata)
library(RColorBrewer)

Structures

Below is a randomly created adjacency matrix. ‘1’ denotes an edge (i.e., connection), and ‘0’ denotes the absence of an edge

set.seed(52)
pilotAdj <- sapply(1:10, function(x) sample(c(0,1),10,replace=T))
diag(pilotAdj) <- 0 # No self-connections
print(pilotAdj)
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    0    0    1    0    1    0    0    1     0
 [2,]    1    0    1    0    1    0    1    1    0     0
 [3,]    0    0    0    0    1    1    0    1    1     0
 [4,]    0    1    0    0    0    0    0    0    0     1
 [5,]    1    0    1    0    0    0    0    1    1     0
 [6,]    0    1    0    0    1    0    1    0    1     0
 [7,]    0    1    1    0    1    0    0    1    0     0
 [8,]    1    1    1    1    0    1    0    0    0     0
 [9,]    0    1    0    1    1    1    1    0    0     1
[10,]    1    1    0    0    1    1    0    0    1     0

This may seem abstract so let’s add some labels to the nodes to make it more meaningful:

people <- c("Luke","Anne","Zee","Yrian","Eleanor","Jazmin","Juan","Kalani","Julia","Brent")
colnames(pilotAdj) <- people
rownames(pilotAdj) <- people
pilotAdj
        Luke Anne Zee Yrian Eleanor Jazmin Juan Kalani Julia Brent
Luke       0    0   0     1       0      1    0      0     1     0
Anne       1    0   1     0       1      0    1      1     0     0
Zee        0    0   0     0       1      1    0      1     1     0
Yrian      0    1   0     0       0      0    0      0     0     1
Eleanor    1    0   1     0       0      0    0      1     1     0
Jazmin     0    1   0     0       1      0    1      0     1     0
Juan       0    1   1     0       1      0    0      1     0     0
Kalani     1    1   1     1       0      1    0      0     0     0
Julia      0    1   0     1       1      1    1      0     0     1
Brent      1    1   0     0       1      1    0      0     1     0

Adjacency Matrix

In the adjacency matrix representation, a graph is represented in the form of a two-dimensional array. The size of the array is V x V, where V is the set of vertices. The following image represents the adjacency matrix representation:

Converting an Adjacency Matrix to a Graph

An Adjacency list is an array consisting of the address of all the linked lists. The first node of the linked list represents the vertex and the remaining lists connected to this node represents the vertices to which this node is connected. This representation can also be used to represent a weighted graph. The linked list can slightly be changed to even store the weight of the edge.

pilotGraph <- graph_from_adjacency_matrix(pilotAdj)
pilotGraph
IGRAPH 38ae74e DN-- 10 42 -- 
+ attr: name (v/c)
+ edges from 38ae74e (vertex names):
 [1] Luke   ->Yrian   Luke   ->Jazmin  Luke   ->Julia   Anne   ->Luke   
 [5] Anne   ->Zee     Anne   ->Eleanor Anne   ->Juan    Anne   ->Kalani 
 [9] Zee    ->Eleanor Zee    ->Jazmin  Zee    ->Kalani  Zee    ->Julia  
[13] Yrian  ->Anne    Yrian  ->Brent   Eleanor->Luke    Eleanor->Zee    
[17] Eleanor->Kalani  Eleanor->Julia   Jazmin ->Anne    Jazmin ->Eleanor
[21] Jazmin ->Juan    Jazmin ->Julia   Juan   ->Anne    Juan   ->Zee    
[25] Juan   ->Eleanor Juan   ->Kalani  Kalani ->Luke    Kalani ->Anne   
[29] Kalani ->Zee     Kalani ->Yrian   Kalani ->Jazmin  Julia  ->Anne   
+ ... omitted several edges

Converting an igraph to an edgelist

An edge list is a list or array of all the edges in a graph. Edge lists are one of the easier representations of a graph. In this implementation, the underlying data structure for keeping track of all the nodes and edges is a single list of pairs. Each pair represents a single edge and is comprised of the two unique IDs of the nodes involved. Each line/edge in the graph gets an entry in the edge list, and that single data structure then encodes all nodes and relationships.

pilotEList <- as_edgelist(pilotGraph)
pilotEList
      [,1]      [,2]     
 [1,] "Luke"    "Yrian"  
 [2,] "Luke"    "Jazmin" 
 [3,] "Luke"    "Julia"  
 [4,] "Anne"    "Luke"   
 [5,] "Anne"    "Zee"    
 [6,] "Anne"    "Eleanor"
 [7,] "Anne"    "Juan"   
 [8,] "Anne"    "Kalani" 
 [9,] "Zee"     "Eleanor"
[10,] "Zee"     "Jazmin" 
[11,] "Zee"     "Kalani" 
[12,] "Zee"     "Julia"  
[13,] "Yrian"   "Anne"   
[14,] "Yrian"   "Brent"  
[15,] "Eleanor" "Luke"   
[16,] "Eleanor" "Zee"    
[17,] "Eleanor" "Kalani" 
[18,] "Eleanor" "Julia"  
[19,] "Jazmin"  "Anne"   
[20,] "Jazmin"  "Eleanor"
[21,] "Jazmin"  "Juan"   
[22,] "Jazmin"  "Julia"  
[23,] "Juan"    "Anne"   
[24,] "Juan"    "Zee"    
[25,] "Juan"    "Eleanor"
[26,] "Juan"    "Kalani" 
[27,] "Kalani"  "Luke"   
[28,] "Kalani"  "Anne"   
[29,] "Kalani"  "Zee"    
[30,] "Kalani"  "Yrian"  
[31,] "Kalani"  "Jazmin" 
[32,] "Julia"   "Anne"   
[33,] "Julia"   "Yrian"  
[34,] "Julia"   "Eleanor"
[35,] "Julia"   "Jazmin" 
[36,] "Julia"   "Juan"   
[37,] "Julia"   "Brent"  
[38,] "Brent"   "Luke"   
[39,] "Brent"   "Anne"   
[40,] "Brent"   "Eleanor"
[41,] "Brent"   "Jazmin" 
[42,] "Brent"   "Julia"  

Adjacency List

In the adjacency list representation, a graph is represented as an array of linked list. The index of the array represents a vertex and each element in its linked list represents the  vertices that form an edge with the vertex. The following image represents the adjacency list representation: 

Converting to adjacency list

as_adj_list(pilotGraph)
$Luke
+ 7/10 vertices, named, from 38ae74e:
[1] Anne    Yrian   Eleanor Jazmin  Kalani  Julia   Brent  

$Anne
+ 11/10 vertices, named, from 38ae74e:
 [1] Luke    Zee     Yrian   Eleanor Jazmin  Juan    Juan    Kalani  Kalani 
[10] Julia   Brent  

$Zee
+ 8/10 vertices, named, from 38ae74e:
[1] Anne    Eleanor Eleanor Jazmin  Juan    Kalani  Kalani  Julia  

$Yrian
+ 5/10 vertices, named, from 38ae74e:
[1] Luke   Anne   Kalani Julia  Brent 

$Eleanor
+ 10/10 vertices, named, from 38ae74e:
 [1] Luke   Anne   Zee    Zee    Jazmin Juan   Kalani Julia  Julia  Brent 

$Jazmin
+ 9/10 vertices, named, from 38ae74e:
[1] Luke    Anne    Zee     Eleanor Juan    Kalani  Julia   Julia   Brent  

$Juan
+ 7/10 vertices, named, from 38ae74e:
[1] Anne    Anne    Zee     Eleanor Jazmin  Kalani  Julia  

$Kalani
+ 9/10 vertices, named, from 38ae74e:
[1] Luke    Anne    Anne    Zee     Zee     Yrian   Eleanor Jazmin  Juan   

$Julia
+ 11/10 vertices, named, from 38ae74e:
 [1] Luke    Anne    Zee     Yrian   Eleanor Eleanor Jazmin  Jazmin  Juan   
[10] Brent   Brent  

$Brent
+ 7/10 vertices, named, from 38ae74e:
[1] Luke    Anne    Yrian   Eleanor Jazmin  Julia   Julia  

Converting igraph back to an adjacency matrix

Notably, you might see that an adjacency matrix is a “sparse matrix” and is distinct from the base R class of “matrix”. To convert it, you will need to go a step further.

pilotAdj <- as_adjacency_matrix(pilotGraph)
pilotAdj
10 x 10 sparse Matrix of class "dgCMatrix"
                           
Luke    . . . 1 . 1 . . 1 .
Anne    1 . 1 . 1 . 1 1 . .
Zee     . . . . 1 1 . 1 1 .
Yrian   . 1 . . . . . . . 1
Eleanor 1 . 1 . . . . 1 1 .
Jazmin  . 1 . . 1 . 1 . 1 .
Juan    . 1 1 . 1 . . 1 . .
Kalani  1 1 1 1 . 1 . . . .
Julia   . 1 . 1 1 1 1 . . 1
Brent   1 1 . . 1 1 . . 1 .

Converting sparse matrix to base matrix

pilotBaseM <- as.matrix(pilotAdj)
pilotBaseM
        Luke Anne Zee Yrian Eleanor Jazmin Juan Kalani Julia Brent
Luke       0    0   0     1       0      1    0      0     1     0
Anne       1    0   1     0       1      0    1      1     0     0
Zee        0    0   0     0       1      1    0      1     1     0
Yrian      0    1   0     0       0      0    0      0     0     1
Eleanor    1    0   1     0       0      0    0      1     1     0
Jazmin     0    1   0     0       1      0    1      0     1     0
Juan       0    1   1     0       1      0    0      1     0     0
Kalani     1    1   1     1       0      1    0      0     0     0
Julia      0    1   0     1       1      1    1      0     0     1
Brent      1    1   0     0       1      1    0      0     1     0

Simple Inspection

Show Vertices

V(pilotGraph)
+ 10/10 vertices, named, from 38ae74e:
 [1] Luke    Anne    Zee     Yrian   Eleanor Jazmin  Juan    Kalani  Julia  
[10] Brent  

Show Edges

E(pilotGraph)
+ 42/42 edges from 38ae74e (vertex names):
 [1] Luke   ->Yrian   Luke   ->Jazmin  Luke   ->Julia   Anne   ->Luke   
 [5] Anne   ->Zee     Anne   ->Eleanor Anne   ->Juan    Anne   ->Kalani 
 [9] Zee    ->Eleanor Zee    ->Jazmin  Zee    ->Kalani  Zee    ->Julia  
[13] Yrian  ->Anne    Yrian  ->Brent   Eleanor->Luke    Eleanor->Zee    
[17] Eleanor->Kalani  Eleanor->Julia   Jazmin ->Anne    Jazmin ->Eleanor
[21] Jazmin ->Juan    Jazmin ->Julia   Juan   ->Anne    Juan   ->Zee    
[25] Juan   ->Eleanor Juan   ->Kalani  Kalani ->Luke    Kalani ->Anne   
[29] Kalani ->Zee     Kalani ->Yrian   Kalani ->Jazmin  Julia  ->Anne   
[33] Julia  ->Yrian   Julia  ->Eleanor Julia  ->Jazmin  Julia  ->Juan   
[37] Julia  ->Brent   Brent  ->Luke    Brent  ->Anne    Brent  ->Eleanor
+ ... omitted several edges

Total Vertices

gorder(pilotGraph)
[1] 10

Total Edges

gsize(pilotGraph)
[1] 42

Basic Plot

plot(pilotGraph)

What is a Directed/Undirected Graph

  • Directed Graph: Directed graphs are a class of graphs that don't presume symmetry or reciprocity in the edges established between vertices. In a directed graph, if a and b are two vertices connected by an edge (a,b), this doesn't necessarily mean that an edge connecting (b,a) also exists

  • Undirected Graph: Symmetric; Undirected graphs are more specific. For them, there's an extra assumption regarding the reciprocity in the relationship between pairs of vertices connected by an edge. If an edge (a,b) exists between two vertices a and b, the edge (b,a) also exists

Examples

Examples of directed graphs include parents and their offspring; Twitter

Examples of undirected graphs include pedestrian pathways, where movement between any two intersections of paths is possible in both directions; railways; Facebook; correlation matrices

Depiction of Directed Graph

Our prior adjacency matrix we generated is a directed graph.

print(pilotAdj)
10 x 10 sparse Matrix of class "dgCMatrix"
                           
Luke    . . . 1 . 1 . . 1 .
Anne    1 . 1 . 1 . 1 1 . .
Zee     . . . . 1 1 . 1 1 .
Yrian   . 1 . . . . . . . 1
Eleanor 1 . 1 . . . . 1 1 .
Jazmin  . 1 . . 1 . 1 . 1 .
Juan    . 1 1 . 1 . . 1 . .
Kalani  1 1 1 1 . 1 . . . .
Julia   . 1 . 1 1 1 1 . . 1
Brent   1 1 . . 1 1 . . 1 .

Depiction of Undirected Graph

set.seed(48)
pilotAdj2 <- pilotAdj <- sapply(1:10, function(x) sample(c(0,1),10,replace=T))
pilotAdj2[lower.tri(pilotAdj2)] = t(pilotAdj2)[lower.tri(pilotAdj2)]
diag(pilotAdj2) <- 0
pilotAdj2
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    0    0    1    1    0    1    1    0     0
 [2,]    0    0    1    1    0    0    1    0    0     0
 [3,]    0    1    0    1    1    0    1    1    0     1
 [4,]    1    1    1    0    0    0    1    1    1     0
 [5,]    1    0    1    0    0    0    1    0    0     0
 [6,]    0    0    0    0    0    0    1    0    0     0
 [7,]    1    1    1    1    1    1    0    1    1     1
 [8,]    1    0    1    1    0    0    1    0    1     0
 [9,]    0    0    0    1    0    0    1    1    0     0
[10,]    0    0    1    0    0    0    1    0    0     0

What is a weighted/unweighted graph

Attributes

Vertex Attributes

partyAtts <- sample(c("Dem","Rep"), gorder(pilotGraph) ,replace = T) # Random attributes
partyAtts # Inspect
 [1] "Rep" "Dem" "Rep" "Rep" "Rep" "Rep" "Dem" "Rep" "Dem" "Dem"
pilotGraph <-set_vertex_attr(pilotGraph, "party", value = partyAtts) # Assign attributes
vertex_attr(pilotGraph) # Check vertex attributes
$name
 [1] "Luke"    "Anne"    "Zee"     "Yrian"   "Eleanor" "Jazmin"  "Juan"   
 [8] "Kalani"  "Julia"   "Brent"  

$party
 [1] "Rep" "Dem" "Rep" "Rep" "Rep" "Rep" "Dem" "Rep" "Dem" "Dem"

Edge Attributes

friendshipAtts <- rnorm(gsize(pilotGraph))
friendshipAtts
 [1]  0.0003886532 -0.9084064952  0.7297462996  1.3719967855 -0.2095287169
 [6]  0.0487437067 -0.4387685857 -0.1643273084  1.9826443201 -1.0800525952
[11]  0.3165564912 -1.0852255825 -0.8968359306 -0.4142618756 -0.3524979177
[16] -0.6337354192 -0.0719193894 -0.7048606787  1.8262535280 -1.3985064735
[21] -1.1625890724 -0.8335206467  0.1487729735  1.2561356477 -0.9197363463
[26] -0.5573351508  0.9715357795  0.9341367607  0.2990071515 -0.7463273171
[31] -0.3663144421 -0.0704711686 -1.2666863813  0.8853194276  0.1243266197
[36]  0.6058275304  1.5574154511  0.4231836437 -0.8765581216  1.0745972898
[41]  0.0741053088  0.9017606859
pilotGraph <-set_edge_attr(pilotGraph, "friendship", value = friendshipAtts) # Assign attributes
edge_attr(pilotGraph) # Check vertex attributes
$friendship
 [1]  0.0003886532 -0.9084064952  0.7297462996  1.3719967855 -0.2095287169
 [6]  0.0487437067 -0.4387685857 -0.1643273084  1.9826443201 -1.0800525952
[11]  0.3165564912 -1.0852255825 -0.8968359306 -0.4142618756 -0.3524979177
[16] -0.6337354192 -0.0719193894 -0.7048606787  1.8262535280 -1.3985064735
[21] -1.1625890724 -0.8335206467  0.1487729735  1.2561356477 -0.9197363463
[26] -0.5573351508  0.9715357795  0.9341367607  0.2990071515 -0.7463273171
[31] -0.3663144421 -0.0704711686 -1.2666863813  0.8853194276  0.1243266197
[36]  0.6058275304  1.5574154511  0.4231836437 -0.8765581216  1.0745972898
[41]  0.0741053088  0.9017606859

Subsetting Graph

We may want to subset all nodes that are “Dem”:

V(pilotGraph)[party=="Dem"]
+ 4/10 vertices, named, from 38ae74e:
[1] Anne  Juan  Julia Brent

We may want to create a subsettted graph of only “Dem”:

induced_subgraph(pilotGraph, vids=which(V(pilotGraph)$party=="Dem") )
IGRAPH 6fcc7b8 DN-- 4 7 -- 
+ attr: name (v/c), party (v/c), friendship (e/n)
+ edges from 6fcc7b8 (vertex names):
[1] Anne ->Juan  Juan ->Anne  Julia->Anne  Julia->Juan  Julia->Brent
[6] Brent->Anne  Brent->Julia

We may want to subset all of the edges that include “Anne”:

E(pilotGraph)[[inc("Anne")]]
+ 11/42 edges from 38ae74e (vertex names):
     tail    head tid hid  friendship
4    Anne    Luke   2   1  1.37199679
5    Anne     Zee   2   3 -0.20952872
6    Anne Eleanor   2   5  0.04874371
7    Anne    Juan   2   7 -0.43876859
8    Anne  Kalani   2   8 -0.16432731
13  Yrian    Anne   4   2 -0.89683593
19 Jazmin    Anne   6   2  1.82625353
23   Juan    Anne   7   2  0.14877297
28 Kalani    Anne   8   2  0.93413676
32  Julia    Anne   9   2 -0.07047117
39  Brent    Anne  10   2 -0.87655812

We may want to subset all edges with a friendship greater than 0:

E(pilotGraph)[[friendship>0]]
+ 20/42 edges from 38ae74e (vertex names):
     tail    head tid hid   friendship
1    Luke   Yrian   1   4 0.0003886532
3    Luke   Julia   1   9 0.7297462996
4    Anne    Luke   2   1 1.3719967855
6    Anne Eleanor   2   5 0.0487437067
9     Zee Eleanor   3   5 1.9826443201
11    Zee  Kalani   3   8 0.3165564912
19 Jazmin    Anne   6   2 1.8262535280
23   Juan    Anne   7   2 0.1487729735
24   Juan     Zee   7   3 1.2561356477
27 Kalani    Luke   8   1 0.9715357795
28 Kalani    Anne   8   2 0.9341367607
29 Kalani     Zee   8   3 0.2990071515
34  Julia Eleanor   9   5 0.8853194276
35  Julia  Jazmin   9   6 0.1243266197
36  Julia    Juan   9   7 0.6058275304
37  Julia   Brent   9  10 1.5574154511
38  Brent    Luke  10   1 0.4231836437
40  Brent Eleanor  10   5 1.0745972898
41  Brent  Jazmin  10   6 0.0741053088
42  Brent   Julia  10   9 0.9017606859

Basic Plotting by Attributes

You can conditionally assign colors based on some attribute and plot based off that attribute. igraph vertices have an attribute for color.

V(pilotGraph)$color <- ifelse(V(pilotGraph)$party == "Dem", "blue", "red")
plot(pilotGraph, vertex.label.color="black")

Using a Real Dataset: ‘UK Faculty’

data("UKfaculty")

Can you inspect the UK

# select colors
colors = brewer.pal(4, "Dark2")
# assign colors to groups
V(UKfaculty)$color = sapply(V(UKfaculty)$Group, function(x) colors[x])

plot(UKfaculty, layout = layout_nicely(UKfaculty, dim = 2),
     vertex.color = V(UKfaculty)$color, vertex.frame.color = NA,
     vertex.label = NA, vertex.shape = 'square',
     vertex.size = 3.5, edge.arrow.size = 0.3, edge.curved = TRUE,
     edge.width = E(UKfaculty)$weight ^ 0.8,
     edge.color = rgb(0, 0, 0, alpha = 0.1))
title("UK Faculty Friendship Network (Directed)", cex.main = 1)